
 encoder part




Satellite imagery segmentation using U-NET

#artificialintelligence

In this blog, we will perform image segmentation on a very limited dataset using U-Net, a popular CNN model for segmentation. We will also use some custom loss functions and metrics for training, such as Dice loss and the Jaccard index. The data that we will be working with comes from Kaggle. The dataset is called Semantic segmentation of aerial imagery. The dataset has two sorts of files .jpg
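The Dice loss and Jaccard index mentioned in this teaser can be sketched in a few lines. This is a minimal NumPy illustration, not the blog's own implementation: the function names, the `smooth` term, and the toy masks are assumptions made for the example.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1e-6):
    """Soft Dice overlap: 2 * |A ∩ B| / (|A| + |B|)."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    # smooth (an assumed small constant) avoids division by zero on empty masks
    return (2.0 * intersection + smooth) / (y_true.sum() + y_pred.sum() + smooth)

def dice_loss(y_true, y_pred, smooth=1e-6):
    """Loss form: 0 for a perfect overlap, close to 1 for no overlap."""
    return 1.0 - dice_coefficient(y_true, y_pred, smooth)

def jaccard_index(y_true, y_pred, smooth=1e-6):
    """Intersection over union: |A ∩ B| / |A ∪ B|."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    union = y_true.sum() + y_pred.sum() - intersection
    return (intersection + smooth) / (union + smooth)

# Toy binary masks (hypothetical data): 3 foreground pixels each, 2 shared.
mask = np.array([[1, 1, 0], [0, 1, 0]])
pred = np.array([[1, 0, 0], [0, 1, 1]])

dice = dice_coefficient(mask, pred)
jaccard = jaccard_index(mask, pred)
print(f"Dice: {dice:.3f}, Jaccard: {jaccard:.3f}")
```

Both scores measure overlap between a predicted mask and the ground truth, which tends to make them more informative than pixel accuracy when one class dominates the image. For a multi-class mask, one common approach is to compute them per class on one-hot encoded masks and average the results.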


Text Classification using Transformers

#artificialintelligence

In this part, we will try to understand the Encoder-Decoder architecture of the Multi-Head Self-Attention Transformer network with some code in PyTorch. There won't be any theory involved (a better theoretical treatment can be found here), just the bare bones of the network and how one can write it from scratch in PyTorch. The Transformer architecture is divided into two parts: the Encoder part and the Decoder part. Each part is itself built from several smaller components. Let's start with the Encoder.
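The core of the Encoder is multi-head self-attention. The sketch below shows one way that computation might look. It uses NumPy rather than the article's PyTorch so that it stays dependency-light, and the function name, the weight matrices, and the tensor sizes are all assumptions for illustration. Layer normalization, residual connections, and the feed-forward sublayer are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """x: (seq_len, d_model); each weight matrix: (d_model, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                # (heads, seq, d_head)

    # Concatenate the heads and apply the output projection.
    merged = context.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o, weights

# Random inputs and weights, purely for a shape check (hypothetical sizes).
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 4, 8, 2
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

out, attn = multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(out.shape, attn.shape)
```

Every token attends to every other token in the sequence, so each row of the attention weights sums to 1 within a head. In PyTorch, the same steps would typically live inside an `nn.Module` with `nn.Linear` layers in place of the raw weight matrices.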


A Reinforcement Learning Based Encoder-Decoder Framework for Learning Stock Trading Rules

arXiv.org Artificial Intelligence

A wide variety of deep reinforcement learning (DRL) models have recently been proposed to learn profitable investment strategies. The rules learned by these models outperform previous strategies, especially in high-frequency trading environments. However, it has been shown that the quality of the features extracted from a long-term sequence of raw prices of the instruments greatly affects the performance of the trading rules learned by these models. Employing a neural encoder-decoder structure to extract informative features from complex input time-series has proved very effective in other popular tasks like neural machine translation and video captioning, in which the models face a similar problem. The encoder-decoder framework extracts highly informative features from a long sequence of prices along with learning how to generate outputs based on the extracted features. In this paper, a novel end-to-end model based on the neural encoder-decoder framework combined with DRL is proposed to learn single instrument trading strategies from a long sequence of raw prices of the instrument. The proposed model consists of an encoder, which is a neural structure responsible for learning informative features from the input sequence, and a decoder, which is a DRL model responsible for learning profitable strategies based on the features extracted by the encoder. The parameters of the encoder and the decoder structures are learned jointly, which enables the encoder to extract features fitted to the task of the decoder DRL. In addition, the effects of different structures for the encoder and various forms of the input sequences on the performance of the learned strategies are investigated. Experimental results showed that the proposed model outperforms other state-of-the-art models in highly dynamic environments.


Understanding Autoencoders with Information Theoretic Concepts

arXiv.org Machine Learning

Despite their great success in practical applications, there is still a lack of theoretical and systematic methods to analyze deep neural networks. In this paper, we illustrate an advanced information theoretic methodology to understand the dynamics of learning and the design of autoencoders, a special type of deep learning architecture that resembles a communication channel. By generalizing the information plane to any cost function, and inspecting the roles and dynamics of different layers using layer-wise information quantities, we emphasize the role that mutual information plays in quantifying learning from data. We further propose and experimentally validate, for mean square error training, two hypotheses regarding the layer-wise flow of information and the intrinsic dimensionality of the bottleneck layer, using respectively the data processing inequality and the identification of a bifurcation point in the information plane that is controlled by the given data. Our observations have a direct impact on the optimal design of autoencoders, the design of alternative feedforward training methods, and even on the problem of generalization.


Autoencoder

#artificialintelligence

Goal: Autoencoders have long been proposed to tackle the problem of unsupervised learning. In this week's summary we look at their ability to provide features that can be successfully used in supervised tasks, and sketch their architecture. Motivation: In supervised learning, deeper architectures used to need some kind of layer pretraining before the actual supervised task could be pursued. Autoencoders came in handy for this: they allowed training one layer after the other and were able to find useful features for the supervised learning. Steps: Let us start by looking at the general architecture.
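As a rough illustration of that general architecture, the sketch below trains a tiny linear autoencoder with plain gradient descent in NumPy. It is a minimal example under stated assumptions, not the summary's own code: the synthetic data, layer sizes, learning rate, and variable names are all hypothetical, and it does not implement layer-wise pretraining.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 6-D points that lie on a 2-D subspace.
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 6))
X = X / X.std()  # rescale so a fixed learning rate behaves reasonably

n_in, n_hidden = 6, 2  # the bottleneck is narrower than the input
W_enc = rng.normal(scale=0.1, size=(n_in, n_hidden))  # encoder weights
W_dec = rng.normal(scale=0.1, size=(n_hidden, n_in))  # decoder weights

def reconstruct(data):
    code = data @ W_enc   # encoder: compress to the bottleneck code
    return code @ W_dec   # decoder: map the code back to input space

def mse(data):
    return float(np.mean((reconstruct(data) - data) ** 2))

loss_before = mse(X)

lr = 0.01
for _ in range(500):
    code = X @ W_enc
    err = code @ W_dec - X                      # reconstruction error
    grad_dec = code.T @ err / len(X)            # gradient w.r.t. decoder
    grad_enc = X.T @ (err @ W_dec.T) / len(X)   # gradient w.r.t. encoder
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

loss_after = mse(X)
print(f"reconstruction MSE: {loss_before:.4f} -> {loss_after:.4f}")
```

No labels are used during training: the network only learns to reproduce its input through the bottleneck. The code `X @ W_enc` is the learned feature representation that could later be passed to a supervised model.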